Non-relevance Feedback Document Retrieval using Large Data Set

Authors

  • Hiroshi Murata
  • Takashi Onoda
  • Seiji Yamada
Abstract

In interactive document retrieval, we need to find documents relevant to our interest in a large document collection within a few iterations of judging retrieved documents. In each iteration, a comparatively small batch of documents is evaluated for relevance to the user's interest. This method is called relevance feedback, and it requires both relevant and non-relevant documents. However, the documents initially presented for the user's judgement do not always include relevant documents. We have therefore proposed a feedback method that uses information on non-relevant documents only, and named it "non-relevance feedback". Non-relevance feedback selects a set of documents that are classified as lying outside the non-relevant region and are near the discriminant hyperplane learned by a One-class Support Vector Machine (One-class SVM). We conducted experiments on large data sets including over 500,000 newspaper articles and confirmed that the proposed method outperformed other methods.
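The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a one-class model has already been trained on the documents judged non-relevant and has produced a decision value for each unjudged document. It also assumes a sign convention: positive inside the learned non-relevant region, negative outside, zero on the hyperplane (the convention used, for example, by scikit-learn's `OneClassSVM.decision_function`). The function name, document ids, and score values are hypothetical.

```python
from typing import Dict, List


def select_feedback_candidates(scores: Dict[str, float], k: int) -> List[str]:
    """Pick up to k documents to show the user next.

    `scores` maps a document id to a one-class decision value, assumed here
    to be positive inside the learned non-relevant region, negative outside
    it, and zero on the discriminant hyperplane.
    """
    # Keep only documents that fall outside the non-relevant region.
    outside = [(doc_id, s) for doc_id, s in scores.items() if s < 0]
    # Closest to the hyperplane first: the largest (least negative) score.
    outside.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in outside[:k]]


# Hypothetical decision values for five unjudged documents.
example_scores = {"d1": 0.8, "d2": -0.05, "d3": -1.2, "d4": -0.3, "d5": 0.1}
print(select_feedback_candidates(example_scores, k=2))
```

In this sketch, documents `d1` and `d5` are skipped because they fall inside the non-relevant region, and the remaining documents are ranked by how close they lie to the boundary.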


Similar resources

Document Image Retrieval Based on Keyword Spotting Using Relevance Feedback

Keyword spotting is a well-known method in document image retrieval. In this method, search in document images is based on a query word image. In this paper, an approach for document image retrieval based on keyword spotting is proposed. The proposed method presents a framework using relevance feedback. Relevance feedback, an interactive and efficient method, is used in this paper to imp...


An One Class Classification Approach to Non-relevance Feedback Document Retrieval

This paper reports a new document retrieval method using non-relevant documents. From a large data set of documents, we need to find documents that relate to human interest in as few iterations of human testing or checking as possible. In each iteration a comparatively small batch of documents is evaluated for relevance to that interest. The relevance feedback needs a set of relevant ...


Relevance Feedback Document Retrieval using Non-Relevant Documents

This paper reports a new document retrieval method using non-relevant documents. From a large data set of documents, we need to find documents that relate to human interest in as few iterations of human testing or checking as possible. In each iteration a comparatively small batch of documents is evaluated for relevance to that interest. This method is called relevance feedback. The r...


CLEF-2005 CL-SR at Maryland: Document and Query Expansion using Side Collections and Thesauri

This paper reports results for the University of Maryland’s participation in CLEF-2005 Cross-Language Speech Retrieval track. Techniques that were tried include: (1) document expansion with manually created metadata (thesaurus keywords and segment summaries) from a large side collection, (2) query refinement with pseudo-relevance feedback, (3) keyword expansion with thesaurus synonyms, and (4) ...


UMass Genomics 2006: Query-Biased Pseudo Relevance Feedback

Query-biased pseudo relevance feedback creates document representations for document feedback that aim to be more relevant to the user than using the entire document. Our submitted runs using query-biased feedback degraded performance compared to not using feedback. The cause of this degradation was the use of too many documents for feedback. Preliminary document retrieval experiments using fewe...




Publication date: 2007